ABSTRACT

Three topics within the domain of regression-based
analyses of multilevel data from quasi-experiments and field studies
in educational research and evaluation are presented. The paper
begins with a discussion of the general question of choice of unit of
analysis, or what may be the more appropriate question of choice of
analytical model. It then provides empirical illustrations of the
importance of knowing the question of interest. Two additional topics
are considered: the use of within-group slopes as indices in
between-group analyses, and the estimation of within-group dependency
and its role in analysis of multilevel data. These topics reflect
substantive concerns in school-based non-experimental investigations.
It is proposed that, in addition to the studies of the variation in
analytical properties across approaches with hypothetical data,
alternative approaches should be applied to a wide variety and
sizeable number of actual data sets, each with a potentially
differing set of inadequacies. In this way, more could be learned
about both the methods and the influences of data limitations on
methods. (Author/FN)
This paper focuses on three topics within the domain of regression-based
analyses of multilevel data from quasi-experiments and field studies
in educational research and evaluation. The paper begins with a discussion
of the general question of choice of unit of analysis or what may be the
more appropriate question of choice of analytical model. After discussing
this issue and providing empirical illustrations of the importance of knowing
the question of interest, two additional topics will be considered: the use
of within-group slopes as indices in between-group analyses and the estimation
of within-group dependency and its role in analyses of multilevel data. These
latter topics reflect important substantive concerns in school-based
non-experimental investigations.

Overall, we believe that the major technical complication in the analysis
of multilevel data from quasi-experiments and field studies is the inability
of educational researchers to develop adequate theories about educational
processes within groups (classrooms and schools) and to develop adequate
methodology for analyzing the educational effects of such processes. The
material presented here reflects an attempt to systematize the investigation
of two important indices of within-group processes.



Choice of Units of Analysis and/or

Choice of Analytical Model

Efforts to identify the effects of education (e.g., Coleman, Campbell,
Hobson, McPartland, Mood, Weinfeld and York, 1966) on pupil performance
have suffered from the complications caused by the multilevel character of
educational data. Schools are aggregates of their teachers, classrooms, and
pupils, and classrooms are aggregates of the persons and processes within
them. This being the case, the effects of education can exist both between
and within the units at each level of the educational system. Yet the
majority of studies of educational effects have restricted attention to
either overall between-student, between-class, or between-school analyses.

Cronbach (1976) argued that the majority of studies of educational
effects carried out thus far conceal more than they reveal, and that "the
established methods have generated false conclusions in many studies" (p. 1).
His concern is foreshadowed in the educational literature by the exchange
among Wiley, Bloom, and Glaser as recorded in Wittrock and Wiley (1970),
and by Haney's (1974) review of the units of analysis problems encountered
in the evaluation of Project Follow Through.

Research on the differences between multiple regression models at
different levels of aggregation (Burstein, 1975, 1978; Hannan and Burstein,
1974; Hannan and Young, 1976a; Feige and Watts, 1972) and on the analyses of
school effects at different levels (Burstein, Fischer, and Miller, 1978;
Burstein and Smith, 1977; Comber and Keeves, 1973; Hannan, Freeman, and
Meyer, 1976; Keesling and Wiley, 1974) indicates that (a) there are
substantial differences in the magnitudes of regression coefficients across
levels for specific models; (b) different variables enter the models at
different levels; and (c) aggregation generally inflates the estimated
effects of pupil background and decreases the likelihood of identifying
teacher and classroom characteristics that are effective. The results cited
above are not very comforting for the researcher who wishes to draw
conclusions about educational processes at one level but is constrained to
analysis at a different level.
When faced with the analysis of multilevel data, most researchers have
tried to make a choice among alternative units of analysis on the basis of
theory or statistical considerations. Unfortunately, those who resort to
theory either reject plausible alternative models (Brophy, 1975; Bloom, 1970;
Stebbins, St. Pierre, Proper, Anderson and Cerva, 1977; Wiley, 1970) or find
themselves unable to choose (Cline, Ames, Anderson, Bale, Ferb, Joshi, Kane,
Larson, Park, Proper, Stebbins, Stern, 1974; Haney, 1974). Picking the
appropriate unit on the basis of statistical considerations can also leave
the choice unresolved due to competing alternatives (Burstein and Smith,
1977; Glendening, 1976; Haney, 1974).

Haney (1974) has elaborated the range of alternative considerations in
the context of the evaluation of Project Follow Through. He cites four general
types: the purpose of the evaluation (questions to be addressed), the evaluation
design (nature of treatments, independence of units and treatment effects,
appropriate size), statistical considerations (reliability of measures, degrees
of freedom, analysis techniques), and practical considerations (missing data,
policy research, multiple year comparisons, economy). Haney was unable to
choose among units because the purpose of the evaluation dictated the child
as the unit but the unit of treatment was the classroom; moreover, the multiyear
character of Follow Through made classrooms impractical as units of analysis.
And, since there was no random assignment at any level and the comparison
children were not equivalent to treatment children, these considerations
offered no relief.

Apparently, thinking of multilevel analyses simply as problems in the
choice of a unit of analysis is inadequate. Phenomena of importance occur
at all levels and need to be described and subjected to inference-making
(Burstein and Linn, 1976; Cronbach, 1976). Once again, Haney's arguments
are succinct and to the point:
Investigators ought to have a strong bias for studying
various properties of the educational system at the level
at which they occur; . . . variation in attributes of
interest ought to be studied at those levels (or between
those units) at which it does (or is expected to) occur. . .
If the hypotheses are explicitly stated in terms of
mathematical models, the impact of shifting levels of
analysis from one unit of analysis to another will be much
more easily assessed than if they are not (1974, pp. 96-97).

These arguments cited by Haney serve as justification for the research we
describe throughout this paper.

Decomposition into Between-Group and Within-Group Effects

A variety of competing points can be cited as traditional justification
for the choice of either pupils or groups (classrooms, schools, etc.) as
the appropriate unit of analysis in studies of educational effects.
Generally, the arguments cited are compelling and virtually irreconcilable if a
choice of either pupil or group as the only unit is required. The multilevel
character of educational data warrants analytical strategies tailored to the
identification of educational effects at and within each level of the
educational system. Moreover, the complexity of the choice depends on the
type of study being conducted as well as the types of outcomes and processes
under investigation.

Even in the simplest case, once the existence of specific group
membership is acknowledged (e.g., instruction from a specific teacher),
any measure that varies over pupils can be decomposed into its
between-group and within-group components. For example, if we consider the posttest
or outcome performance, $Y_{ij}$, of pupil $j$ in class $i$ ($j = 1, \ldots, n$
persons per class; $i = 1, \ldots, k$ classes; for simplicity we assume
equal-size classes) and the performance level $X_{ij}$ of the pupil prior to entering
the class (i.e., the pretest or some measure of entering ability), then
the relation of $X_{ij}$ to $Y_{ij}$ can be decomposed into between-class and
within-class components (Burstein, Linn, and Capell, 1978; Cronbach, 1976):



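$$
\begin{aligned}
Y_{ij} - \bar{Y}_{..} ={}& \beta_b(\bar{X}_{i.} - \bar{X}_{..}) && \text{Predicted Between-Class} \\
&+ \left[(\bar{Y}_{i.} - \bar{Y}_{..}) - \beta_b(\bar{X}_{i.} - \bar{X}_{..})\right] && \text{Adjusted Between-Class} \\
&+ \beta_w(X_{ij} - \bar{X}_{i.}) && \text{Pooled Within-Class Slope} \\
&+ (\beta_i - \beta_w)(X_{ij} - \bar{X}_{i.}) && \text{Specific Within-Class} \\
&+ \varepsilon_{ij} && \text{Specific Residual Associated with Person } ij
\end{aligned}
\tag{1}
$$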

In the above equation, $\beta_b$ is the between-class slope from the
regression of $\bar{Y}_{i.}$ on $\bar{X}_{i.}$, $\beta_w$ is the pooled within-class slope from the
regression of $(Y_{ij} - \bar{Y}_{i.})$ on $(X_{ij} - \bar{X}_{i.})$ across all classrooms, and the
$\beta_i$ are the specific within-class slopes from the regression of $Y_{ij}$ on
$X_{ij}$ within each classroom $i$.
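
The decomposition can be traced numerically. The following is a minimal sketch in plain Python, assuming small hypothetical data (three classes of four pupils); the function names `ols_slope` and `decompose` and all data values are illustrative assumptions, not taken from the studies cited. It computes the between-class slope, the pooled within-class slope, and the specific within-class slopes, then splits each pupil's deviation from the grand mean into the five components.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope from regressing ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def decompose(classes):
    """Split each pupil's deviation from the grand mean into five components.

    `classes` is a list of (pretests, posttests) pairs, one pair per class,
    with equal class sizes as the text assumes.
    """
    x_means = [mean(xs) for xs, _ in classes]
    y_means = [mean(ys) for _, ys in classes]
    # With equal class sizes the grand mean equals the mean of class means.
    x_grand, y_grand = mean(x_means), mean(y_means)

    beta_b = ols_slope(x_means, y_means)                # between-class slope
    betas = [ols_slope(xs, ys) for xs, ys in classes]   # specific within-class slopes

    # Pooled within-class slope: regress class-centered Y on class-centered X.
    xc = [x - xm for (xs, _), xm in zip(classes, x_means) for x in xs]
    yc = [y - ym for (_, ys), ym in zip(classes, y_means) for y in ys]
    beta_w = sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)

    rows = []
    for (xs, ys), xm, ym, b_i in zip(classes, x_means, y_means, betas):
        for x, y in zip(xs, ys):
            predicted = beta_b * (xm - x_grand)      # predicted between-class
            adjusted = (ym - y_grand) - predicted    # adjusted between-class
            pooled = beta_w * (x - xm)               # pooled within-class slope
            specific = (b_i - beta_w) * (x - xm)     # specific within-class
            residual = (y - ym) - b_i * (x - xm)     # residual for pupil ij
            rows.append((y - y_grand, predicted, adjusted, pooled, specific, residual))
    return beta_b, beta_w, betas, rows


# Hypothetical data: three classes of four pupils (pretest X, posttest Y).
classes = [
    ([1, 2, 3, 4], [2, 3, 5, 6]),
    ([3, 4, 5, 6], [4, 6, 7, 9]),
    ([5, 6, 7, 8], [9, 9, 10, 12]),
]
beta_b, beta_w, betas, rows = decompose(classes)
print("beta_b:", beta_b, "beta_w:", beta_w, "beta_i:", betas)
```

Each tuple in `rows` holds the left-hand side followed by the five components, so summing the last five entries should reproduce the first. The identity holds by construction; the substantive interest lies in how large each component is.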

The possible substantive interpretations of specific components and
sets of components are important here. (See particularly descriptions
of alternative analytical models and the section on slopes as indices.)
The key elements are the between-class slope, the adjusted between-class
effect, the pooled within-class slope, and the specific within-class slopes.
Often, Equation (1) can be modified so that we have a global measure,
$T_i$, of classes (e.g., class membership, teacher quality, or treatment group)
rather than the aggregation of individual scores represented by $\bar{X}_{i.}$.
In what follows we shall refer to the effects associated with either
$\bar{X}_{i.}$ or $T_i$ as class effects without loss of generality.

One useful treatment of the multilevel analysis was provided by
Cronbach (1976). A succinct statement of Cronbach's justification for
his proposed analysis is that the usual overall between-student analysis
combines two kinds of relationships, those operating between collectives
(reflected in $\beta_b$ and adjusted class effects) and those operating among
persons within collectives (reflected in $\beta_w$ and $\beta_i$), into a composite that
is rarely of substantive interest (Cronbach, 1976, pp. 10.j3ff.). Cronbach
reminds us that $\beta_t$, the overall between-student coefficient from the
regression of $Y_{ij}$ on $X_{ij}$, has been shown by Duncan, Cuzzort, and Duncan
(1961, p. 66) to be a composite of $\beta_b$ and $\beta_w$:

(3) $\beta_t = \eta^2 \beta_b + (1 - \eta^2)\beta_w$

where $\eta^2$ is the intraclass correlation or correlation ratio of $X$. Cronbach
(1976; Cronbach and Webb, 1975) goes on to recommend that between-group
effects and individuals-within-group effects should be examined separately.
In its most parsimonious form, Cronbach would examine the following:

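(4) Between Groups: $\bar{Y}_{i.} - \bar{Y}_{..} = \beta_{YT \cdot X} T_i + \beta_b(\bar{X}_{i.} - \bar{X}_{..})$

where $\beta_{YT \cdot X}$ is the effect of teachers on mean outcomes after controlling
for between-class differences in inputs.

(5) Pooled Within-Groups: $Y_{ij} - \bar{Y}_{i.} = \beta_w(X_{ij} - \bar{X}_{i.})$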



Thus, Cronbach's primary concerns are with the adjusted collective
effects of instruction as reflected by the adjusted class mean outcomes,
and with the overall redistributive properties of classroom instruction
as reflected by the pooled within-class regression, $\beta_w$.
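
The composite relation attributed to Duncan, Cuzzort, and Duncan can be checked on toy data. This is a minimal sketch assuming three hypothetical equal-size classes; the helper `sums_of_squares` and the data values are illustrative assumptions. The correlation ratio of X is computed as the between-class share of the total sum of squares of X.

```python
from statistics import mean


def sums_of_squares(xs, ys):
    """Return (Sxx, Sxy) taken about the means of xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, sxy


# Hypothetical data: three equal-size classes (pretest X, posttest Y).
classes = [
    ([1, 2, 3, 4], [2, 3, 5, 6]),
    ([3, 4, 5, 6], [4, 6, 7, 9]),
    ([5, 6, 7, 8], [9, 9, 10, 12]),
]
n = len(classes[0][0])

# Overall between-student slope, ignoring class membership.
all_x = [x for xs, _ in classes for x in xs]
all_y = [y for _, ys in classes for y in ys]
sxx_total, sxy_total = sums_of_squares(all_x, all_y)
beta_t = sxy_total / sxx_total

# Between-class slope from the class means.
x_means = [mean(xs) for xs, _ in classes]
y_means = [mean(ys) for _, ys in classes]
sxx_means, sxy_means = sums_of_squares(x_means, y_means)
beta_b = sxy_means / sxx_means

# Pooled within-class slope from within-class sums of squares and products.
within = [sums_of_squares(xs, ys) for xs, ys in classes]
beta_w = sum(sxy for _, sxy in within) / sum(sxx for sxx, _ in within)

# Correlation ratio of X: between-class share of the total sum of squares.
eta_sq = n * sxx_means / sxx_total
composite = eta_sq * beta_b + (1 - eta_sq) * beta_w
print("beta_t:", beta_t, "composite:", composite)
```

With equal class sizes the two printed values should agree up to rounding. The more of the variation in X that lies between classes, the more the overall coefficient is pulled toward the between-class slope.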

Empirical Results from Multilevels of IEA Data

For the time being, we focus on the two estimators of most interest
to Cronbach, between-group regression coefficients and the corresponding
pooled within-group coefficients. Recent empirical analyses of data from
the IEA Six Subject Survey (Burstein, Fischer, and Miller, 1978) dramatically
demonstrate the distinct differences in interpretation when one moves from
a between-school to a within-school analysis. The survey investigated the
factors influencing educational achievement in twenty-one countries,
considering six subject areas (Science, Reading Comprehension, Literature,
Civics Education, English as a Foreign Language, and French as a Foreign
Language) at three age levels (basically, 10 year-olds, 14 year-olds, and
students in their pretertiary year). Over 700 student, teacher, and school
characteristics were measured.

In an investigation of educational effects models for 14 year-olds from
the U.S. and Sweden in the IEA science achievement study (Table 1) we
found that the effects of family background on science achievement were
substantial, as usual, in the between-schools analysis of U.S. data but
much smaller in Sweden. In fact, for 14 year-olds, $R^2_{\text{within-schools}}$ was larger than
$R^2_{\text{between-schools}}$ in Sweden, which would be atypical for analysis of U.S. data.

In contrast, the effects of family background in the pooled within-school
analyses for the U.S. were substantially smaller and were essentially
the same as the effects found in the within-school analysis
for Sweden. One possible substantive explanation for these findings is
that the two types of analyses reflect, on one hand, distinctions between
the countries in the political order governing the distribution of pupil
backgrounds and school resources (i.e., the predominance of local control
and community determination of school resources in the U.S. vs. national
control and a policy of uniformity of resources for Sweden) and, on the
other, similarity between countries in the operation of the social order
within schools (i.e., interpersonal allocations of rewards within an
institution).

There are further substantive questions that the above example might
address, but the methodological point is clear: different types of analysis
of multilevel data address different questions, and research on
schooling typically asks questions at multiple levels.

Within-Group Slopes as Indices 
in Between-Group Analyses 

Once it is determined that the questions of interest and/or statistical
considerations warrant analyses of aggregated data, the types of
between-group effects one expects to find remain to be specified. In
particular, when one's purpose is to determine factors affecting pupil
performance, it is possible that analyses of between-group (class, school,
etc.) means can hide important differences in the within-group distribution
of pupil outcomes and educational inputs.

Several aspects of current schooling practices lead us to expect
that within-school and within-class distributions of pupil performance
vary. First, schools (classes) do differ in the distribution of educational
performance. Moreover, schools with the same mean outcome often exhibit
different distributions of performance within school. An analysis of
means alone could not be expected to account for such distributional
differences.

Second, a variety of educational theories about the effects of specific
schooling practices on within-group behavior argue for an examination of
distributional properties other than group means. Obviously, at least the
variability of performance is of interest in studies comparing individualized,
competency-based, or open educational instructional programs with more
traditional instructional practices. Also, research on the interaction
between teaching style and learning style would lead one to expect
variability of outcomes for pupils with similar entering characteristics and
preferences taught by teachers with differing instructional styles.

Finally, the idea of using distributional characteristics in addition
to the mean as criterion measures has been shown previously to merit
consideration (Lohnes, 1972; Klitgaard, 1975; Brown and Saks, 1975). Lohnes
(1972) found that standard deviations and skewness indices added to the
explanatory power of means in his analyses of data from the Cooperative
Reading Project. Klitgaard (1975) and Brown and Saks (1975) found that
school and school district standard deviations exhibited more significant
relations with school characteristics than did school and school district
means.

Though they sought answers to different questions and used different
methodologies, Lohnes, Brown and Saks, and Klitgaard apparently share our
belief that educational outcomes are multifaceted and incompletely measured
by single group averages. There also seems to be consensus that educational
theory can be developed which will link pupil entering characteristics
and characteristics of the educative process to distributional
properties of educational outcomes.

We (Burstein and Linn, 1976; Burstein, Linn and Capell, 1978) have
elaborated a theory for the use of within-group slopes of outcomes on inputs
as a criterion in educational effects studies. Wiley (1970) may have been
the first to suggest this strategy.

Our justification for considering within-group slopes as outcomes
derives much of its impetus from research on aptitude-treatment interactions
(Snow, 1976) and from evidence of slope differences among colleges (Rock,
Baird, and Linn, 1970). In its simplest form, we expect that different
combinations of teachers and instructional practices will result in varying
distributions of educational outcomes for pupils with similar entering
characteristics. For example, it might be hypothesized that there are
teachers who are equally effective in obtaining mean performance, but yield
varying slopes because some teachers use compensatory instructional practices
which emphasize the improved performance of lower-ability students while
others allow each child to learn at his/her own rate. (We would expect
a flatter slope in the former case than in the latter.)
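
The hypothesized contrast can be illustrated with two made-up classes. This is a minimal sketch assuming hypothetical pretest and posttest scores; the labels `compensatory` and `own_rate` and every number are assumptions chosen only so that the two classes share the same pretest distribution and the same posttest mean.

```python
from statistics import mean


def within_class_slope(pre, post):
    """Least-squares slope of posttest on pretest within one class."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    return sxy / sxx


# Hypothetical classes with identical pretests and identical posttest means.
compensatory = {"pre": [10, 20, 30, 40], "post": [28, 32, 38, 42]}
own_rate = {"pre": [10, 20, 30, 40], "post": [20, 30, 40, 50]}

mean_comp = mean(compensatory["post"])
mean_own = mean(own_rate["post"])
slope_comp = within_class_slope(compensatory["pre"], compensatory["post"])
slope_own = within_class_slope(own_rate["pre"], own_rate["post"])

print("posttest means:", mean_comp, mean_own)
print("within-class slopes:", slope_comp, slope_own)
```

An analysis of class means alone would treat these two classes as identical, while the slopes separate them: the compensatory class shows the flatter slope.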

Burstein and Linn (1976; Burstein, Linn and Capell, 1978) compared
alternative analytical models for identifying educational effects for
sets of hypothetical classrooms with heterogeneous slopes. The key
findings were that, for the conditions studied, heterogeneous within-class
slopes were shown to make important differences in identified effects, ones
which were not swamped by sampling variability in the estimation of slopes,
and certain analytical strategies exhibited good properties even in the
presence of heterogeneity.
Although within-group slopes are conceptually appealing indices of
educational effects, three points warrant further examination. First,
it must be determined that slopes are sufficiently stable. Second, it
must be demonstrated that slopes are potentially distinct from other
group indices (e.g., pre and posttest means and standard deviations) in
realistic situations. Finally, there have to be realistic cases in
which slopes are related to school and class characteristics after
controlling for other background measures and other indices of group
outcomes. We have already begun to investigate these points (see below).

Stability of Slopes 

The sampling variability of within-group slopes is substantially
greater than that of the mean. For small samples, e.g., the size of a
classroom, the sampling error of a slope is so large that it is questionable
whether real differences in slopes may reasonably be distinguished
from the noise; moreover, any outlier can dominate the slope. If the
real differences in slopes are as large as those generated in Burstein
and Linn (1976), then it is important to take them into account. Whether
the differences in real classrooms are of similar magnitude remains
an open question at this stage, however.

Since students within a classroom are not a random sample, but
possibly are better thought of as fixed once the classroom is chosen, it is
not clear how best to investigate the relative magnitude of signal and
noise in the differences among within-classroom slopes. Linn and Burstein
(1977) found little support for the notion that slopes varied systematically
when a posttest in reading was regressed on a pretest in reading (Figure 1)
and only limited support for the notion based on a similar set of
regressions for math using small samples of classrooms from the ETS BTES
study (McDonald and Elias, 1976). But these analyses were based on
traditional confidence intervals which treated each class as if the students
in it were a random sample from a population. As already noted, the random
sampling model is questionable in this situation.

While random sampling of students may not provide the best model,
there is a need to allow for disturbances in the observed slope due to
idiosyncratic occurrences at the time of measurement. Just as an
individual's observed score is distinguished from an underlying true score
in classical test theory, there is a need to distinguish between the
observed measure for the group (in this case the slope) and an underlying
"true" slope.

Several approaches can be used to investigate the relative size of
signal and noise in the within-group slope estimates. Confidence
intervals can be computed for the within-group slopes for selected sets,
as was done by Linn and Burstein (1977) for BTES data. The jackknife
procedure (Mosteller and Tukey, 1977) can also be used to estimate slopes
and confidence intervals.
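
A leave-one-out jackknife for a single within-class slope might look as follows. This is a minimal sketch assuming one hypothetical class of eight pupils; the function name `jackknife_slope` and the scores are illustrative assumptions, and the pseudo-value formulation used here is the standard textbook one rather than a procedure confirmed from the studies cited.

```python
from math import sqrt
from statistics import mean, variance


def ols_slope(xs, ys):
    """Least-squares slope from regressing ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def jackknife_slope(xs, ys):
    """Jackknife estimate and standard error of a within-class slope.

    Each pupil is dropped in turn; pseudo-values are
    n * full_slope - (n - 1) * leave_one_out_slope.
    """
    n = len(xs)
    full = ols_slope(xs, ys)
    pseudo = []
    for j in range(n):
        x_loo = xs[:j] + xs[j + 1:]
        y_loo = ys[:j] + ys[j + 1:]
        pseudo.append(n * full - (n - 1) * ols_slope(x_loo, y_loo))
    estimate = mean(pseudo)
    std_error = sqrt(variance(pseudo) / n)  # variance uses the n - 1 divisor
    return estimate, std_error, pseudo


# Hypothetical class of eight pupils (pretest, posttest).
pre = [12, 15, 18, 20, 23, 25, 28, 30]
post = [30, 34, 33, 41, 40, 47, 46, 52]
estimate, std_error, pseudo = jackknife_slope(pre, post)
print("jackknife slope:", estimate, "standard error:", std_error)
```

Pupils whose pseudo-values lie far from the rest are the ones that dominate the slope, which makes the pseudo-values a rough screen for the outlier problem in classroom-size samples.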

Relations of Slopes to Other Group-Level Indices

If slopes are to provide a useful addition to the array of outcomes,
they must be distinct from other indices. Linn and Burstein (1977) have
investigated this property of slopes. For three separate data sets (BTES data
on classrooms collected by ETS (McDonald and Elias, 1976); Michigan Assessment
data on schools reported in Marco (1974); and IEA data on schools (see Table 2)),
they found that though pretest and posttest means correlated with each other
in the range of .5-.8, the correlations of within-group slopes with either
means or standard deviations (or, for that matter, with skewness and kurtosis
indices and sample size) are much lower and, except for the pretest
standard deviations (which are spuriously related to slopes), are rarely
significant.

The results cited suggest that slopes are sufficiently distinct from
means and standard deviations to warrant further consideration.

Relation of Slopes to School and Class Characteristics

The final line in an investigation of the potential utility of
within-group slopes involves their relationships to measures of school
and classroom processes. Preliminary results of an analysis of science
achievement data on U.S. 14-year-olds in the IEA study (Burstein, 1978)
provided tantalizing evidence of the possible payoff from this activity.
Burstein found that the within-school slopes of science achievement on
a verbal ability measure (assessed concurrently) were significantly and
positively related to school mean responses of pupils on indices of
exposure to science instruction and of the degree to which pupils reported
instructional practices which emphasized exploration-discovery methods
of instruction. (See Table 3.) These significant results occurred despite
controls for pretest and posttest means and standard deviations and pupil
home background measures.

The results described above fit in well with recent research on
informal/open/individually-guided/unstructured instruction (see particularly
Rosenshine (1978) and Stebbins and others (1977)). Instruction which
emphasizes student self-direction (selection) of learning goals and methods
tends to exacerbate pre-existing differences in pupil skills. Higher-ability
students tend to make more appropriate choices and achieve at a faster
rate than lower-ability students.

The steeper within-group slopes found with greater opportunities for
exposure to instruction and with greater emphasis on individual exploration
cited above are consistent with expectations from other research and suggest
the need for similar investigations with other data sets.

Estimating Within-Group Dependency 
in Multilevel Analysis 

The Problem of Dependency among Observations within Units

The problem of dependence among observations within groups is endemic
to research on hierarchically nested school data, and can be especially
critical when intact classrooms are investigated. Cronbach and Webb
(Cronbach, 1976; Cronbach and Webb, 1975; Webb, 1977) have argued that
when intact groups are assigned to instructional treatments, the students
in those treatments cannot be considered independent units and therefore
the typical analysis based on all individuals pooled across groups can be
justifiably criticized as inappropriate.

The crucial problem in ignoring group membership is that educational
treatments are not administered independently to individuals (Wiley, 1970).
Individuals within the classroom have shared experiences. This
non-independence of individuals within the group can be expressed by an
intraclass correlation structure. The consequences of ignoring this intraclass
structure (i.e., treating individuals as independent by ignoring group
membership) are serious (Walsh, 1947; Weibull, 1953).
Recent work by Glendening (1976) provides a thorough discussion of
the problem in the experimental design frame of reference. (The work of
Glass and Stanley (1970) and Peckham, Glass and Hopkins (1969) is also
summarized by Glendening, 1976.) Glendening simulated the effects
of violating the assumption of independence within the context of a balanced
two-level hierarchically-nested design, with subjects (S) nested within
classrooms (C) and classrooms nested within treatments (T). She
operationally defined independence as that condition wherein the expected mean
square between classrooms, EMS(C:T), equals that within classrooms, EMS(S:CT).
She found that a model with the pupil as the unit, or a conditional model where
a preliminary test of independence is followed by a choice of unit of analysis
for testing treatment effects, yielded spuriously small error terms and
therefore too liberal tests of treatment effects. Glendening concluded that the
researcher must choose a priori between the class (dependence) or student
(independence) as the unit, but acknowledged the complications of obtaining
prior knowledge about independence of response.
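
The mean squares behind this definition of independence, and a corresponding intraclass correlation, can be sketched for the simpler case of classes within a single treatment. This is a minimal sketch under stated assumptions: equal class sizes, hypothetical scores, an illustrative function name `intraclass_correlation`, and the standard one-way analysis-of-variance estimator, which is not necessarily the estimator used by the authors cited.

```python
from statistics import mean


def intraclass_correlation(groups):
    """One-way ANOVA estimate for equal-size groups.

    Returns (ms_between, ms_within, icc), where
    icc = (MSB - MSW) / (MSB + (n - 1) * MSW).
    """
    k = len(groups)        # number of classes
    n = len(groups[0])     # pupils per class (assumed equal)
    group_means = [mean(g) for g in groups]
    grand = mean(group_means)
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ms_within = ss_within / (k * (n - 1))
    icc = (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)
    return ms_between, ms_within, icc


# Hypothetical outcome scores for three classes of four pupils.
scores = [
    [10, 12, 11, 13],
    [20, 19, 21, 22],
    [15, 16, 14, 15],
]
ms_between, ms_within, icc = intraclass_correlation(scores)
print("MS between:", ms_between, "MS within:", ms_within, "ICC:", icc)
```

When the two mean squares are close, the estimate sits near zero, matching the operational definition of independence above. A between-class mean square much larger than the within-class one pushes the estimate toward one.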

While Glendening and Porter focused on the implications of intraclass
correlation for the analysis of experimental data, Webb (1977) was
concerned with the antecedents of such intraclass relations in research on
group process. Webb compared learning in interacting groups and learning
singly, attempting to explain differences as a function of the
characteristics of the individual, the group, and the group process. The group
process results provided a key to understanding why some students learned
best in interacting groups, whereas others did best learning singly. In
general, group members who actively participated in discussions did better
than those who did not actively participate, and did at least as well as
after individual learning. Whether a pupil actively participated was
related to the pupil's ability ranking within the group and the range
and level of ability in the group. Knowing the abilities of the students
in a group, one could predict fairly well who interacted with whom and,
consequently, who did best.

The results of this highly structured study suggest that knowledge
of group processes in a particular class is crucial for understanding the
degree to which students are working together, and therefore crucial for
estimating degree of dependence in the class. Studying group process may
be the only way to get at this dependence. Unless students in a class
are receiving completely individualized instruction, rarely will it be
tenable to base analyses on the assumption of an intraclass correlation
of zero. Unless all students are receiving exactly the same instruction
and interact with fellow students in the same manner and amounts of time,
an intraclass correlation of one is unreasonable. Examination of lower-level
processes will help locate the intraclass correlation on the continuum
between 0 and 1.

Webb suggests that the above procedures may be generalized to real 
teacher-taught classrooms, considering interactions between teacher and 
students, interactions among students, and characteristics of students
(abilities, personality variables) and teachers. In the long run, one 
hopes to be able to predict student performance from a combination of 
these variables. 

Clearly, research on most educational phenomena will involve dependent
observations. Moreover, dependence cannot be viewed as an all-or-none
phenomenon; it is a matter of degree. It depends on what is being
measured (the outcome) and the "treatments" or "causes" under study.
It is also a function of the composition of the units and the nature of
the grouping mechanism, as Webb (1977) has demonstrated. Therefore, tests
for independence and adjustments for intraclass correlations are more
appealing than automatic aggregation to the classroom level.

Analytical methods are needed which will account for the degree of
dependency and make adjustments, where appropriate, to the estimated effects
and associated estimates of precision. Moreover, estimators of dependency
may be useful as indicators of classroom process. That is, it may be
possible to relate these estimated relationships to characteristics of
students, teachers, and instructional context.

Concluding Remarks

The topics discussed in this paper are a subset of a broader range of
issues and problems which require more attention over the next few years.
Table 4 lists a variety of types of studies and types of outcomes for which
multilevel analysis issues must be resolved. It is unclear what form the
final products of the investigation of the analysis of multilevel data will
take, but it is possible to imagine the following scenario. As a preamble,
we point to a trend developing in educational evaluation for the conduct of
what Glass (1976) has termed "meta-analyses" (see also Light and Smith, 1971).
Persons conducting meta-analyses seek to accumulate knowledge about the
impact and characteristics of a particular educational innovation by
aggregating findings across numerous investigations of the phenomena.
^. - ^ 
There would seem to be a natural parallel to meta-analysis which is
relevant to the examination of alternative methodological approaches for
the analysis of multilevel data. There are two key obstacles to the
development of appropriate methodologies in this context. First, the
available methodological approaches vary greatly in the degree to which
they are theory-based as opposed to ad hoc. Second, all currently
available empirical data sets suffer from a variety of inadequacies which,
taken singly, limit their utility for comparing alternative methodological
approaches.

We believe that it is important to identify approaches which are 
practically viable as well as theoretically sound and which are usable 
with actual as well as hypothetical data. Therefore, we propose that in 
addition to the studies of the variation in analytical properties across 
approaches with hypothetical data, the alternative approaches should be 
applied to a wide variety and sizeable number of actual data sets, each 
with a potentially differing set of inadequacies. In this way we hope to 
learn more about both the methods (e.g., which are more generally usable; 
which behave similarly for specific kinds of data sets) and the influence 
of data limitations on methods (e.g., the exclusion of what types of 
information makes different approaches impractical or unattractive). 

Reading on Pretest in Reading for 33 Second-Grade Classrooms 
(Data from McDonald & Elias, 1976) 

Source: Linn & Burstein, 1977 

Table 1. 

Between-student, between-school and pooled within-school regression analyses of factors 
affecting science achievement (RSCI) for 14-year-olds from the IEA study in the United 
States 

                                        Metric Regression Coefficient 

                          Between School            Pooled Within School      Between Student 
                          United                    United                    United 
Variable                  States       Sweden       States       Sweden       States       Sweden 

Sex                       -6.620       -3.362+      -3.853*      -5.281       -4.157       -5.165 
                          [t-statistics illegible] 
Word Knowledge            .876         .569+        .812         .773*        .861         .754 
                          [t-statistics partly legible: (23.22), (18.02)] 
Father's Occupation       .843         .194+        .307*        .297         .487         .256 
                          [t-statistics illegible] 
Number of Books in Home   3.57?        1.217+       1.223*       1.324        1.661        1.324 
                          (3.37)       (.92)        (5.36)       (5.04)       (7.20)       (4.99) 
Grade                     1.39?        4.186+       2.291*       2.941*       1.912        3.083 
                          (1.69)       (2.49)       (5.25)       (7.25)       (5.24)       (7.65) 
Science Study             .110         -.149+       .065*        -.122*       .066         -.125 
                          [t-statistics illegible] 
[label illegible]         .?40         .382+        .067*        [illegible]  .130         [illegible] 
                          (1.50)       (1.10)       (1.25)       (1.31)       (2.52)       (.85) 
[label illegible]         72           31           31           34           39           34 

Number of Schools:   United States 107; Sweden 93 
Number of Students:  United States 1806; Sweden 1675 

The between-school analyses are run with each school weighted by the number of students. However, all 
t-statistics were adjusted to reflect the number of schools rather than the number of students. 

t-statistics are reported in parentheses. 

+ Variable for which between-country differences were significant at p < .05. 

* Within-country variables for which the between-school and within-school coefficients differ by 
at least two standard errors. 
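
The relation among the three kinds of analysis reported in Table 1 can be illustrated for a single predictor. The sketch below is not the authors' procedure: it uses hypothetical data and a hypothetical helper, `three_slopes`, to compute the between-student (total), between-school (school means weighted by number of students), and pooled within-school slopes. It also returns eta-squared, the proportion of the predictor's variation that lies between schools, so that the standard identity total = eta_sq * between + (1 - eta_sq) * within can be checked.

```python
from statistics import mean


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def three_slopes(schools):
    """schools: list of (xs, ys) pairs of score lists, one pair per school.

    Returns (total, between, within, eta_sq).
    """
    all_x, all_y = [], []      # raw student scores
    mean_x, mean_y = [], []    # each student's school means
    dev_x, dev_y = [], []      # deviations from school means
    for xs, ys in schools:
        mx, my = mean(xs), mean(ys)
        for x, y in zip(xs, ys):
            all_x.append(x)
            all_y.append(y)
            mean_x.append(mx)  # repeating the mean per student weights
            mean_y.append(my)  # each school by its number of students
            dev_x.append(x - mx)
            dev_y.append(y - my)

    total = slope(all_x, all_y)
    between = slope(mean_x, mean_y)
    within = slope(dev_x, dev_y)

    grand = mean(all_x)
    eta_sq = (sum((m - grand) ** 2 for m in mean_x)
              / sum((x - grand) ** 2 for x in all_x))
    return total, between, within, eta_sq


# Hypothetical data: three schools, three students each.
schools = [([1, 2, 3], [2, 3, 4]),
           ([4, 5, 6], [3, 4, 5]),
           ([7, 8, 9], [4, 5, 6])]
total, between, within, eta_sq = three_slopes(schools)
```

In this constructed example the within-school slope is 1 in every school while the school means rise only one third as fast, so the three analyses give three different answers to what looks like the same question.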
Table 2. Correlations among descriptive statistics from IEA 
data for the United States (N = 107) 

[Side label, partly legible: READING] 

             [illegible]  Y            X            S_Y          S_X          N            Mean 

[illegible]               .10          .14          .59*         .13          .29*         .85 
Y            -.02                      .88*         -.09         -.24         .27          177.74 
X            .09          .93*                      -.12         -.34*        .20          152.50 
S_Y          .48*         -.18         -.21                      .78*         -.38*        18.31 
S_X          -.04         [illegible]  .02          .65*                      .31*         18.41 
N            -.35*        [illegible]  .19          -.19         -.05 

Mean         .90          173.20       156.27       34.03        31.78 

SOURCE: Burstein, 1978. 
Table 3. School-level regressions of means, standard deviations, and slopes on background and school 
characteristics for the United States Population II, IEA Study. 

                                               Dependent Variables 
                          Metric Coefficient                     Standardized Coefficient 
Independent               Science      Science                   Science      Science 
Variable                  Mean         SD           Slope        Mean         SD           Slope 

Sex                       -6.744       -2.44?       [illegible]  [illegible]  [illegible]  -.236 
                          (-4.24)      (2.26)       (-2.43) 
Word Knowledge            [illegible]  [illegible]  [illegible]  .640         .126         .282 
                          (8.33)       (1.27)       (2.94) 
Father's Occupation       .437         .306         .014         .149         .153         .031 
                          (1.50)       (1.55)       (.31) 
Number of Books in Home   [illegible]  1.5?4        -.295        .436         .241         -.020 
                          (4.84)       (2.49)       (-.20) 
Science Study             .074         .038         .021         .169         .128         .302 
                          (1.72)       (1.29)       (3.17) 
Exploratory Methods       .317         .226         .072         .191         .200         .270 
                          (1.94)       (2.04)       (2.81) 
[label illegible]         .76          .34          .27 

Slopes are from within-school regressions of science score on word knowledge score. 

t-statistics in parentheses. 

SOURCE: Burstein and Miller, 1978 
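
The use of within-school slopes as outcomes, as in the Slope columns of Table 3, can be sketched in two steps: estimate a slope within each school, then regress those slopes on a school-level characteristic. The code below is an illustration under stated assumptions, not the analysis behind the table: the data are constructed so that the slope rises by exactly 0.1 per unit of a hypothetical school characteristic `z`, and the helper names are hypothetical.

```python
from statistics import mean


def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b


def slopes_as_outcomes(schools, characteristic):
    """Step 1: one within-school slope per school.
    Step 2: regress those slopes on a school-level characteristic.

    schools: list of (xs, ys) pairs of score lists, one pair per school.
    characteristic: one value per school, in the same order.
    Returns (within_slopes, (intercept, slope)).
    """
    within_slopes = [ols(xs, ys)[1] for xs, ys in schools]
    return within_slopes, ols(characteristic, within_slopes)


# Hypothetical data: four schools whose within-school slope is 0.5 + 0.1 * z.
z = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 1.0, 2.0, 3.0]
schools = [(x, [10.0 + (0.5 + 0.1 * zj) * xi for xi in x]) for zj in z]

within_slopes, (intercept, effect_of_z) = slopes_as_outcomes(schools, z)
```

With real data each within-school slope is estimated with error that depends on school size and on the spread of the predictor within the school, so an unweighted second-stage regression like this one is only a starting point.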



Table 4. Classifications of types of studies and types of outcomes for 
the investigation of educational effects. 

I. TYPE OF STUDY 

   A. MANIPULATION 

      1. EXPERIMENTAL/TRUE -- "Units" assigned to alternative treatments 
         or Treatment/Non-treatments; some form of manipulation 

         a. Random Assignment of Pupils from Classrooms to Treatments -- 
            Pupils randomly assigned to treatment conditions; treatment 
            outside of normal class routine; treatment non-group work 

         b. Random Assignment of Pupils from Classrooms to Groups -- 
            Pupils randomly assigned to treatment groups; treatment 
            outside normal class routine 

         c. Random Assignment of Pupils to Classes -- Pupils randomly 
            assigned to classes; classes randomly assigned to treatments 

         d. Random Assignment of Partial Classes to Treatments -- Portions 
            of class randomly assigned to different treatment conditions 

         e. Random Assignment of Intact Classes to Treatments -- Students 
            assigned to classes on unknown non-random basis; intact classes 
            assigned to treatments 

      2. EXPERIMENTAL/ATI -- Conditions under 1 with additional question 
         of interaction with entering characteristics 

      3. EXPERIMENTAL/LONGITUDINAL -- Repeated measurement (mastery testing, 
         sequential analysis of behavior and interaction patterns, 
         persistence) in context of empirical studies 

   B. NON-MANIPULATION 

      1. NON-EXPERIMENTAL/CROSS-SECTIONAL -- Large-scale cross-sectional 
         survey of pupils, teachers/classrooms, schools, etc. for purpose 
         of establishing educational school/teacher effects model 

      2. NON-EXPERIMENTAL/LONGITUDINAL -- Large-scale longitudinal survey 
         (e.g., income maintenance, voucher study, Follow Through Evaluation) 

      3. NON-EXPERIMENTAL/OUTLIER (RESIDUAL) ANALYSIS -- Develop indices 
         of effects of system over and beyond what can be anticipated by 
         entering characteristics 

      4. NON-EXPERIMENTAL/CONTEXTUAL (COMPOSITIONAL) EFFECT -- Examination 
         of whether the composition/frog-pond/normative climate of 
         institution has an effect 

II. TYPE OF OUTCOME 

   A. SHORT TERM -- Duration of a lesson to, say, a year 

      1. Specific Cognitive Objective -- Single content domain/objective 
         in an instructional sequence 

      2. General Cognitive Objective -- Standardized achievement test 
         or total score over multiple objectives CRM 

      3. Affective Objective -- Attitude toward self and subject matter, 
         efficiency 

      4. Group Behavior -- Peer socialization, group cohesiveness, group 
         interaction 

   B. LONG TERM -- Duration of multiple years, retrospective academic 
      antecedents 

      1. General Cognitive Outcome -- Standardized test or cumulative 
         grades (e.g., SAT as outcome, prediction of future grades from 
         earlier test scores) 

      2. Educational Attainment -- Level of education 

      3. Occupational Attainment -- Level of occupation (social 
         stratification theory) 

      4. Career Plans/Career Satisfaction 

      5. General Mental Health 

Footnote 

An earlier version of this paper was presented at the Institute for 
Research on Teaching, Michigan State University, East Lansing, Michigan, 
December 10, 1977. This paper presents work partially supported by the 
National Institute of Education contract NIE G-78-0113 with the Center 
for the Study of Evaluation, University of California, Los Angeles, and 
by a grant from the Spencer Foundation to the Graduate School of Education, 
University of California, Los Angeles. The contents of the paper in no 
way reflect official opinions of the organizations mentioned above and 
they are not responsible for the interpretations made herein. 